
Fixing the large gap between training and validation accuracy when doing transfer learning with Keras InceptionV3 and ResNet50 models


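The Kaggle Human Protein Atlas Image Classification competition has wrapped up, so I finally have some free time to write about the pitfalls I ran into along the way.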

The Keras version is 2.2.4.

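Has anyone else run into this? When doing transfer learning in Keras with InceptionV3, ResNet50, or other models that contain BN (batch normalization) layers, the training and validation results differ wildly, for example like this: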

    Epoch 1/20
    1500/1500 [==============================] - 24s 16ms/step - loss: 2.1168 - binary_accuracy: 0.9169 - f1_keras: 0.0617 - val_loss: 2.2727 - val_binary_accuracy: 0.9258 - val_f1_keras: 0.0377
    Epoch 2/20
    1500/1500 [==============================] - 19s 13ms/step - loss: 1.1976 - binary_accuracy: 0.9480 - f1_keras: 0.1084 - val_loss: 2.4163 - val_binary_accuracy: 0.9218 - val_f1_keras: 0.0356
    Epoch 3/20
    1500/1500 [==============================] - 19s 13ms/step - loss: 0.9935 - binary_accuracy: 0.9540 - f1_keras: 0.1608 - val_loss: 2.7485 - val_binary_accuracy: 0.9114 - val_f1_keras: 0.0359
    Epoch 4/20
    1500/1500 [==============================] - 19s 13ms/step - loss: 0.8294 - binary_accuracy: 0.9572 - f1_keras: 0.1902 - val_loss: 2.9039 - val_binary_accuracy: 0.9166 - val_f1_keras: 0.0402
    Epoch 5/20
    1500/1500 [==============================] - 19s 13ms/step - loss: 0.7250 - binary_accuracy: 0.9606 - f1_keras: 0.2482 - val_loss: 3.1574 - val_binary_accuracy: 0.9057 - val_f1_keras: 0.0485

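As you can see, the training loss keeps decreasing while the validation loss keeps increasing, and the validation accuracy and F1 score are also far from the training results. Some of you may suspect overfitting, and I suspected it too. So I ran another experiment with the validation set replaced by the training set, meaning training and validation used exactly the same samples. In that case the training and validation results should match, yet I got almost the same results as above.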
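A quick way to put numbers on this is to compute the per-epoch gap between the validation and training values. The sketch below is only an illustration: `metric_gaps` and `keeps_widening` are hypothetical helpers, and the `history` dict is typed in by hand from the log above, in the shape of the `History.history` dict that Keras `fit` returns.

```python
def metric_gaps(history, metric):
    """Per-epoch (validation - training) difference for one metric."""
    train = history[metric]
    val = history["val_" + metric]
    return [round(v - t, 4) for t, v in zip(train, val)]


def keeps_widening(gaps):
    """True if every epoch's gap is larger than the previous one."""
    return all(later > earlier for earlier, later in zip(gaps, gaps[1:]))


# Loss values copied from the five epochs logged above.
history = {
    "loss": [2.1168, 1.1976, 0.9935, 0.8294, 0.7250],
    "val_loss": [2.2727, 2.4163, 2.7485, 2.9039, 3.1574],
}

loss_gaps = metric_gaps(history, "loss")
print(loss_gaps)
print(keeps_widening(loss_gaps))
```

Even when the validation data is the training data, a small gap is normal, because Keras averages the training loss over the batches of an epoch while the weights are still changing and computes the validation loss once at the end. A gap that grows every epoch is not explained by that.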

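I then repeated the experiment with VGG-19 in place of InceptionV3. VGG-19, like other models without BN layers, did not show the problem. So I suspected the BN layers were the culprit, and after some digging I found that the problem lies in the model-building code. Here is the faulty model-building code first (this is just my own opinion; if I have something wrong, I hope someone more knowledgeable will point it out). The code below is the example given in the official Keras documentation, and my results above came from code with this structure (same structure, slightly different details).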

    from keras.applications.inception_v3 import InceptionV3
    from keras.preprocessing import image
    from keras.models import Model
    from keras.layers import Dense, GlobalAveragePooling2D
    from keras import backend as K

    # create the base pre-trained model
    base_model = InceptionV3(weights='imagenet', include_top=False)

    # add a global spatial average pooling layer
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    # let's add a fully-connected layer
    x = Dense(1024, activation='relu')(x)
    # and a logistic layer -- let's say we have 200 classes
    predictions = Dense(200, activation='softmax')(x)

    # this is the model we will train
    model = Model(inputs=base_model.input, outputs=predictions)

    # first: train only the top layers (which were randomly initialized)
    # i.e. freeze all convolutional InceptionV3 layers
    for layer in base_model.layers:
        layer.trainable = False

    # compile the model (should be done *after* setting layers to non-trainable)
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

    # train the model on the new data for a few epochs
    model.fit_generator(...)
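Note that the freezing loop above also sets `trainable = False` on every BN layer in `base_model`. In general, a BN layer normalizes with the statistics of the current mini-batch in training mode and with its stored moving averages in inference mode, so the same input can come out differently in the two modes. The pure-Python sketch below illustrates only that general mechanism. Whether this is what happens inside Keras 2.2.4 here is my assumption, and the function names and numbers are made up for illustration (gamma = 1 and beta = 0 are assumed and left out).

```python
import math
import statistics

EPSILON = 1e-3  # default epsilon of the Keras BatchNormalization layer


def normalize(values, mean, variance, eps=EPSILON):
    """Core BN formula without the learned scale (gamma) and shift (beta)."""
    return [(v - mean) / math.sqrt(variance + eps) for v in values]


def bn_training_mode(batch):
    """Training mode: normalize with the statistics of the batch itself."""
    mean = statistics.mean(batch)
    variance = statistics.pvariance(batch, mu=mean)
    return normalize(batch, mean, variance)


def bn_inference_mode(batch, moving_mean, moving_variance):
    """Inference mode: normalize with the stored moving statistics."""
    return normalize(batch, moving_mean, moving_variance)


# Hypothetical activations computed from the new dataset.
batch = [4.0, 5.0, 6.0, 7.0]
# Hypothetical moving statistics left over from pre-training on other data.
moving_mean, moving_variance = 0.0, 1.0

train_out = bn_training_mode(batch)
infer_out = bn_inference_mode(batch, moving_mean, moving_variance)

print("mean in training mode: ", round(statistics.mean(train_out), 3))
print("mean in inference mode:", round(statistics.mean(infer_out), 3))
```

The training-mode output is centered on zero, while the inference-mode output sits far from it because the stored statistics do not match the new data. Layers after BN that were fitted on one kind of output would then see the other kind at validation time, which would be consistent with identical samples scoring differently in training and validation.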

Run model.summary() to look at the model structure:

    activation_20 (Activation)      (None, None, None, 6  0           batch_normalization_20[0][0]
    activation_22 (Activation)      (None, None, None, 6  0           batch_normalization_22[0][0]
    activation_25 (Activation)      (None, None, None, 9  0           batch_normalization_25[0][0]
    activation_26 (Activation)      (None, None, None, 6  0           batch_normalization_26[0][0]
    mixed2 (Concatenate)            (None, None, None, 2  0           activation_20[0][0]
                                                                      activation_22[0][0]
                                                                      activation_25[0][0]
                                                                      activation_26[0][0]
    conv2d_28 (Conv2D)              (None, None, None, 6  18432       mixed2[0][0]
    batch_normalization_28 (BatchNo (None, None, None, 6  192         conv2d_28[0][0]
    activation_28 (Activation)      (None, None, None, 6  0           batch_normalization_28[0][0]
    conv2d_29 (Conv2D)              (None, None, None, 9  55296       activation_28[0][0]
    batch_normalization_29 (BatchNo (None, None, None, 9  288         conv2d_29[0][0]
    activation_29 (Activation)      (None, None, None, 9  0           batch_normalization_29[0][0]
    conv2d_27 (Conv2D)              (None, None, None, 3  995328      mixed2[0][0]
    conv2d_30 (Conv2D)              (None, None, None, 9  82944       activation_29[0][0]
    batch_normalization_27 (BatchNo (None, None, None, 3  1152        conv2d_27[0][0]
    batch_normalization_30 (BatchNo (None, None, None, 9  288         conv2d_30[0][0]
    activation_27 (Activation)      (None, None, None, 3  0           batch_normalization_27[0][0]
    activation_30 (Activation)      (None, None, None, 9  0           batch_normalization_30[0][0]
    max_pooling2d_3 (MaxPooling2D)  (None, None, None, 2  0           mixed2[0][0]
    mixed3 (Concatenate)            (None, None, None, 7  0           activation_27[0][0]
                                                                      activation_30[0][0]
                                                                      max_pooling2d_3[0][0]
    conv2d_35 (Conv2D)              (None, None, None, 1  98304       mixed3[0][0]
    batch_normalization_35 (BatchNo (None, None, None, 1  384         conv2d_35[0][0]
    activation_35 (Activation)      (None, None, None, 1  0           batch_normalization_35[0][0]
    conv2d_36 (Conv2D)              (None, None, None, 1  114688      activation_35[0][0]
    batch_normalization_36 (BatchNo (None, None, None, 1  384         conv2d_36[0][0]
    activation_36 (Activation)      (None, None, None, 1  0           batch_normalization_36[0][0]
    conv2d_32 (Conv2D)              (None, None, None, 1  98304       mixed3[0][0]
    conv2d_37 (Conv2D)              (None, None, None, 1  114688      activation_36[0][0]
    batch_normalization_32 (BatchNo (None, None, None, 1  384         conv2d_32[0][0]
    batch_normalization_37 (BatchNo (None, None, None, 1  384         conv2d_37[0][0]
    activation_32 (Activation)      (None, None, None, 1  0           batch_normalization_32[0][0]
    activation_37 (Activation)      (None, None, None, 1  0           batch_normalization_37[0][0]
    conv2d_33 (Conv2D)              (None, None, None, 1  114688      activation_32[0][0]
    conv2d_38 (Conv2D)              (None, None, None, 1  114688      activation_37[0][0]
    batch_normalization_33 (BatchNo (None, None, None, 1  384         conv2d_33[0][0]
    batch_normalization_38 (BatchNo (None, None, None, 1  384         conv2d_38[0][0]
    activation_33 (Activation)      (None, None, None, 1  0           batch_normalization_33[0][0]
    activation_38 (Activation)      (None, None, None, 1  0           batch_normalization_38[0][0]
    average_pooling2d_4 (AveragePoo (None, None, None, 7  0           mixed3[0][0]
    conv2d_31 (Conv2D)              (None, None, None, 1  147456      mixed3[0][0]
    conv2d_34 (Conv2D)              (None, None, None, 1  172032      activation_33[0][0]
    conv2d_39 (Conv2D)              (None, None, None, 1  172032      activation_38[0][0]
    conv2d_40 (Conv2D)              (None, None, None, 1  147456      average_pooling2d_4[0][0]
    batch_normalization_31 (BatchNo (None, None, None, 1  576         conv2d_31[0][0]
    batch_normalization_34 (BatchNo (None, None, None, 1  576         conv2d_34[0][0]
    batch_normalization_39 (BatchNo (None, None, None, 1  576         conv2d_39[0][0]
    batch_normalization_40 (BatchNo (None, None, None, 1  576         conv2d_40[0][0]
    activation_31 (Activation)      (None, None, None, 1  0           batch_normalization_31[0][0]
    activation_34 (Activation)      (None, None, None, 1  0           batch_normalization_34[0][0]
    activation_39 (Activation)      (None, None, None, 1  0           batch_normalization_39[0][0]
    activation_40 (Activation)      (None, None, None, 1  0           batch_normalization_40[0][0]
    mixed4 (Concatenate)            (None, None, None, 7  0           activation_31[0][0]
                                                                      activation_34[0][0]
                                                                      activation_39[0][0]
                                                                      activation_40[0][0]
    conv2d_45 (Conv2D)              (None, None, None, 1  122880      mixed4[0][0]
    batch_normalization_45 (BatchNo (None, None, None, 1  480         conv2d_45[0][0]
    activation_45 (Activation)      (None, None, None, 1  0           batch_normalization_45[0][0]
    conv2d_46 (Conv2D)              (None, None, None, 1  179200      activation_45[0][0]
    batch_normalization_46 (BatchNo (None, None, None, 1  480         conv2d_46[0][0]
    activation_46 (Activation)      (None, None, None, 1  0           batch_normalization_46[0][0]
    conv2d_42 (Conv2D)              (None, None, None, 1  122880      mixed4[0][0]
    conv2d_47 (Conv2D)              (None, None, None, 1  179200      activation_46[0][0]
    batch_normalization_42 (BatchNo (None, None, None, 1  480         conv2d_42[0][0]
    batch_normalization_47 (BatchNo (None, None, None, 1  480         conv2d_47[0][0]
    activation_42 (Activation)      (None, None, None, 1  0           batch_normalization_42[0][0]
    activation_47 (Activation)      (None, None, None, 1  0           batch_normalization_47[0][0]
    conv2d_43 (Conv2D)              (None, None, None, 1  179200      activation_42[0][0]
    conv2d_48 (Conv2D)              (None, None, None, 1  179200      activation_47[0][0]
    batch_normalization_43 (BatchNo (None, None, None, 1  480         conv2d_43[0][0]
    batch_normalization_48 (BatchNo (None, None, None, 1  480         conv2d_48[0][0]
    activation_43 (Activation)      (None, None, None, 1  0           batch_normalization_43[0][0]
    activation_48 (Activation)      (None, None, None, 1  0           batch_normalization_48[0][0]
    average_pooling2d_5 (AveragePoo (None, None, None, 7  0           mixed4[0][0]
    conv2d_41 (Conv2D)              (None, None, None, 1  147456      mixed4[0][0]
    conv2d_44 (Conv2D)              (None, None, None, 1  215040      activation_43[0][0]
    conv2d_49 (Conv2D)              (None, None, None, 1  215040      activation_48[0][0]
    conv2d_50 (Conv2D)              (None, None, None, 1  147456      average_pooling2d_5[0][0]
    batch_normalization_41 (BatchNo (None, None, None, 1  576         conv2d_41[0][0]
    batch_normalization_44 (BatchNo (None, None, None, 1  576         conv2d_44[0][0]
    batch_normalization_49 (BatchNo (None, None, None, 1  576         conv2d_49[0][0]
    batch_normalization_50 (BatchNo (None, None, None, 1  576         conv2d_50[0][0]
    activation_41 (Activation)      (None, None, None, 1  0           batch_normalization_41[0][0]
    activation_44 (Activation)      (None, None, None, 1  0           batch_normalization_44[0][0]
    activation_49 (Activation)      (None, None, None, 1  0           batch_normalization_49[0][0]
    activation_50 (Activation)      (None, None, None, 1  0           batch_normalization_50[0][0]
    mixed5 (Concatenate)            (None, None, None, 7  0           activation_41[0][0]
                                                                      activation_44[0][0]
                                                                      activation_49[0][0]
                                                                      activation_50[0][0]
    conv2d_55 (Conv2D)              (None, None, None, 1  122880      mixed5[0][0]
    batch_normalization_55 (BatchNo (None, None, None, 1  480         conv2d_55[0][0]
    activation_55 (Activation)      (None, None, None, 1  0           batch_normalization_55[0][0]
    conv2d_56 (Conv2D)              (None, None, None, 1  179200      activation_55[0][0]
    batch_normalization_56 (BatchNo (None, None, None, 1  480         conv2d_56[0][0]
    activation_56 (Activation)      (None, None, None, 1  0           batch_normalization_56[0][0]
    conv2d_52 (Conv2D)              (None, None, None, 1  122880      mixed5[0][0]
    conv2d_57 (Conv2D)              (None, None, None, 1  179200      activation_56[0][0]
    batch_normalization_52 (BatchNo (None, None, None, 1  480         conv2d_52[0][0]
    batch_normalization_57 (BatchNo (None, None, None, 1  480         conv2d_57[0][0]
    activation_52 (Activation)      (None, None, None, 1  0           batch_normalization_52[0][0]
    activation_57 (Activation)      (None, None, None, 1  0           batch_normalization_57[0][0]
    conv2d_53 (Conv2D)              (None, None, None, 1  179200      activation_52[0][0]
    conv2d_58 (Conv2D)              (None, None, None, 1  179200      activation_57[0][0]
    batch_normalization_53 (BatchNo (None, None, None, 1  480         conv2d_53[0][0]
    batch_normalization_58 (BatchNo (None, None, None, 1  480         conv2d_58[0][0]
    activation_53 (Activation)      (None, None, None, 1  0           batch_normalization_53[0][0]
    activation_58 (Activation)      (None, None, None, 1  0           batch_normalization_58[0][0]
    average_pooling2d_6 (AveragePoo (None, None, None, 7  0           mixed5[0][0]
    conv2d_51 (Conv2D)              (None, None, None, 1  147456      mixed5[0][0]
    conv2d_54 (Conv2D)              (None, None, None, 1  215040      activation_53[0][0]
    conv2d_59 (Conv2D)              (None, None, None, 1  215040      activation_58[0][0]
    conv2d_60 (Conv2D)              (None, None, None, 1  147456      average_pooling2d_6[0][0]
    batch_normalization_51 (BatchNo (None, None, None, 1  576         conv2d_51[0][0]
    batch_normalization_54 (BatchNo (None, None, None, 1  576         conv2d_54[0][0]
    batch_normalization_59 (BatchNo (None, None, None, 1  576         conv2d_59[0][0]
    batch_normalization_60 (BatchNo (None, None, None, 1  576         conv2d_60[0][0]
    activation_51 (Activation)      (None, None, None, 1  0           batch_normalization_51[0][0]
    activation_54 (Activation)      (None, None, None, 1  0           batch_normalization_54[0][0]
    activation_59 (Activation)      (None, None, None, 1  0           batch_normalization_59[0][0]
    activation_60 (Activation)      (None, None, None, 1  0           batch_normalization_60[0][0]
    mixed6 (Concatenate)            (None, None, None, 7  0           activation_51[0][0]
                                                                      activation_54[0][0]
                                                                      activation_59[0][0]
                                                                      activation_60[0][0]
    conv2d_65 (Conv2D)              (None, None, None, 1  147456      mixed6[0][0]
    batch_normalization_65 (BatchNo (None, None, None, 1  576         conv2d_65[0][0]
    activation_65 (Activation)      (None, None, None, 1  0           batch_normalization_65[0][0]
    conv2d_66 (Conv2D)              (None, None, None, 1  258048      activation_65[0][0]
    batch_normalization_66 (BatchNo (None, None, None, 1  576         conv2d_66[0][0]
    activation_66 (Activation)      (None, None, None, 1  0           batch_normalization_66[0][0]
    conv2d_62 (Conv2D)              (None, None, None, 1  147456      mixed6[0][0]
    conv2d_67 (Conv2D)              (None, None, None, 1  258048      activation_66[0][0]
    batch_normalization_62 (BatchNo (None, None, None, 1  576         conv2d_62[0][0]
    batch_normalization_67 (BatchNo (None, None, None, 1  576         conv2d_67[0][0]
    activation_62 (Activation)      (None, None, None, 1  0           batch_normalization_62[0][0]
    activation_67 (Activation)      (None, None, None, 1  0           batch_normalization_67[0][0]
    conv2d_63 (Conv2D)              (None, None, None, 1  258048      activation_62[0][0]
    conv2d_68 (Conv2D)              (None, None, None, 1  258048      activation_67[0][0]
    batch_normalization_63 (BatchNo (None, None, None, 1  576         conv2d_63[0][0]
    batch_normalization_68 (BatchNo (None, None, None, 1  576         conv2d_68[0][0]
    activation_63 (Activation)      (None, None, None, 1  0           batch_normalization_63[0][0]
    activation_68 (Activation)      (None, None, None, 1  0           batch_normalization_68[0][0]
    average_pooling2d_7 (AveragePoo (None, None, None, 7  0           mixed6[0][0]
    conv2d_61 (Conv2D)              (None, None, None, 1  147456      mixed6[0][0]
    conv2d_64 (Conv2D)              (None, None, None, 1  258048      activation_63[0][0]
    conv2d_69 (Conv2D)              (None, None, None, 1  258048      activation_68[0][0]
    conv2d_70 (Conv2D)              (None, None, None, 1  147456      average_pooling2d_7[0][0]
    batch_normalization_61 (BatchNo (None, None, None, 1  576         conv2d_61[0][0]
    batch_normalization_64 (BatchNo (None, None, None, 1  576         conv2d_64[0][0]
    batch_normalization_69 (BatchNo (None, None, None, 1  576         conv2d_69[0][0]
    batch_normalization_70 (BatchNo (None, None, None, 1  576         conv2d_70[0][0]
    activation_61 (Activation)      (None, None, None, 1  0           batch_normalization_61[0][0]
    activation_64 (Activation)      (None, None, None, 1  0           batch_normalization_64[0][0]
    activation_69 (Activation)      (None, None, None, 1  0           batch_normalization_69[0][0]
    activation_70 (Activation)      (None, None, None, 1  0           batch_normalization_70[0][0]
    mixed7 (Concatenate)            (None, None, None, 7  0           activation_61[0][0]
                                                                      activation_64[0][0]
                                                                      activation_69[0][0]
                                                                      activation_70[0][0]
    conv2d_73 (Conv2D)              (None, None, None, 1  147456      mixed7[0][0]
    batch_normalization_73 (BatchNo (None, None, None, 1  576         conv2d_73[0][0]
    activation_73 (Activation)      (None, None, None, 1  0           batch_normalization_73[0][0]
    conv2d_74 (Conv2D)              (None, None, None, 1  258048      activation_73[0][0]
    batch_normalization_74 (BatchNo (None, None, None, 1  576         conv2d_74[0][0]
    activation_74 (Activation)      (None, None, None, 1  0           batch_normalization_74[0][0]
    conv2d_71 (Conv2D)              (None, None, None, 1  147456      mixed7[0][0]
    conv2d_75 (Conv2D)              (None, None, None, 1  258048      activation_74[0][0]
    batch_normalization_71 (BatchNo (None, None, None, 1  576         conv2d_71[0][0]
    batch_normalization_75 (BatchNo (None, None, None, 1  576         conv2d_75[0][0]
    activation_71 (Activation)      (None, None, None, 1  0           batch_normalization_71[0][0]
    activation_75 (Activation)      (None, None, None, 1  0           batch_normalization_75[0][0]
    conv2d_72 (Conv2D)              (None, None, None, 3  552960      activation_71[0][0]
    conv2d_76 (Conv2D)              (None, None, None, 1  331776      activation_75[0][0]
    batch_normalization_72 (BatchNo (None, None, None, 3  960         conv2d_72[0][0]
    batch_normalization_76 (BatchNo (None, None, None, 1  576         conv2d_76[0][0]
    activation_72 (Activation)      (None, None, None, 3  0           batch_normalization_72[0][0]
    activation_76 (Activation)      (None, None, None, 1  0           batch_normalization_76[0][0]
    max_pooling2d_4 (MaxPooling2D)  (None, None, None, 7  0           mixed7[0][0]
    mixed8 (Concatenate)            (None, None, None, 1  0           activation_72[0][0]
                                                                      activation_76[0][0]
                                                                      max_pooling2d_4[0][0]
    conv2d_81 (Conv2D)              (None, None, None, 4  573440      mixed8[0][0]
    batch_normalization_81 (BatchNo (None, None, None, 4  1344        conv2d_81[0][0]
    activation_81 (Activation)      (None, None, None, 4  0           batch_normalization_81[0][0]
    conv2d_78 (Conv2D)              (None, None, None, 3  491520      mixed8[0][0]
    conv2d_82 (Conv2D)              (None, None, None, 3  1548288     activation_81[0][0]
    batch_normalization_78 (BatchNo (None, None, None, 3  1152        conv2d_78[0][0]
    batch_normalization_82 (BatchNo (None, None, None, 3  1152        conv2d_82[0][0]
    activation_78 (Activation)      (None, None, None, 3  0           batch_normalization_78[0][0]
    activation_82 (Activation)      (None, None, None, 3  0           batch_normalization_82[0][0]
    conv2d_79 (Conv2D)              (None, None, None, 3  442368      activation_78[0][0]
    conv2d_80 (Conv2D)              (None, None, None, 3  442368      activation_78[0][0]
    conv2d_83 (Conv2D)              (None, None, None, 3  442368      activation_82[0][0]
    conv2d_84 (Conv2D)              (None, None, None, 3  442368      activation_82[0][0]
    average_pooling2d_8 (AveragePoo (None, None, None, 1  0           mixed8[0][0]
    conv2d_77 (Conv2D)              (None, None, None, 3  409600      mixed8[0][0]
    batch_normalization_79 (BatchNo (None, None, None, 3  1152        conv2d_79[0][0]
    batch_normalization_80 (BatchNo (None, None, None, 3  1152        conv2d_80[0][0]
    batch_normalization_83 (BatchNo (None, None, None, 3  1152        conv2d_83[0][0]
    batch_normalization_84 (BatchNo (None, None, None, 3  1152        conv2d_84[0][0]
    conv2d_85 (Conv2D)              (None, None, None, 1  245760      average_pooling2d_8[0][0]
    batch_normalization_77 (BatchNo (None, None, None, 3  960         conv2d_77[0][0]
    activation_79 (Activation)      (None, None, None, 3  0           batch_normalization_79[0][0]
    activation_80 (Activation)      (None, None, None, 3  0           batch_normalization_80[0][0]
    activation_83 (Activation)      (None, None, None, 3  0           batch_normalization_83[0][0]
    activation_84 (Activation)      (None, None, None, 3  0           batch_normalization_84[0][0]
    batch_normalization_85 (BatchNo (None, None, None, 1  576         conv2d_85[0][0]
__________________________________________________________________________________________________ activation_77 (Activation) (None, None, None, 3 0 batch_normalization_77[0][0] __________________________________________________________________________________________________ mixed9_0 (Concatenate) (None, None, None, 7 0 activation_79[0][0] activation_80[0][0] __________________________________________________________________________________________________ concatenate_1 (Concatenate) (None, None, None, 7 0 activation_83[0][0] activation_84[0][0] __________________________________________________________________________________________________ activation_85 (Activation) (None, None, None, 1 0 batch_normalization_85[0][0] __________________________________________________________________________________________________ mixed9 (Concatenate) (None, None, None, 2 0 activation_77[0][0] mixed9_0[0][0] concatenate_1[0][0] activation_85[0][0] __________________________________________________________________________________________________ conv2d_90 (Conv2D) (None, None, None, 4 917504 mixed9[0][0] __________________________________________________________________________________________________ batch_normalization_90 (BatchNo (None, None, None, 4 1344 conv2d_90[0][0] __________________________________________________________________________________________________ activation_90 (Activation) (None, None, None, 4 0 batch_normalization_90[0][0] __________________________________________________________________________________________________ conv2d_87 (Conv2D) (None, None, None, 3 786432 mixed9[0][0] __________________________________________________________________________________________________ conv2d_91 (Conv2D) (None, None, None, 3 1548288 activation_90[0][0] __________________________________________________________________________________________________ batch_normalization_87 (BatchNo (None, None, None, 3 1152 conv2d_87[0][0] 
__________________________________________________________________________________________________ batch_normalization_91 (BatchNo (None, None, None, 3 1152 conv2d_91[0][0] __________________________________________________________________________________________________ activation_87 (Activation) (None, None, None, 3 0 batch_normalization_87[0][0] __________________________________________________________________________________________________ activation_91 (Activation) (None, None, None, 3 0 batch_normalization_91[0][0] __________________________________________________________________________________________________ conv2d_88 (Conv2D) (None, None, None, 3 442368 activation_87[0][0] __________________________________________________________________________________________________ conv2d_89 (Conv2D) (None, None, None, 3 442368 activation_87[0][0] __________________________________________________________________________________________________ conv2d_92 (Conv2D) (None, None, None, 3 442368 activation_91[0][0] __________________________________________________________________________________________________ conv2d_93 (Conv2D) (None, None, None, 3 442368 activation_91[0][0] __________________________________________________________________________________________________ average_pooling2d_9 (AveragePoo (None, None, None, 2 0 mixed9[0][0] __________________________________________________________________________________________________ conv2d_86 (Conv2D) (None, None, None, 3 655360 mixed9[0][0] __________________________________________________________________________________________________ batch_normalization_88 (BatchNo (None, None, None, 3 1152 conv2d_88[0][0] __________________________________________________________________________________________________ batch_normalization_89 (BatchNo (None, None, None, 3 1152 conv2d_89[0][0] __________________________________________________________________________________________________ batch_normalization_92 (BatchNo 
(None, None, None, 3 1152 conv2d_92[0][0] __________________________________________________________________________________________________ batch_normalization_93 (BatchNo (None, None, None, 3 1152 conv2d_93[0][0] __________________________________________________________________________________________________ conv2d_94 (Conv2D) (None, None, None, 1 393216 average_pooling2d_9[0][0] __________________________________________________________________________________________________ batch_normalization_86 (BatchNo (None, None, None, 3 960 conv2d_86[0][0] __________________________________________________________________________________________________ activation_88 (Activation) (None, None, None, 3 0 batch_normalization_88[0][0] __________________________________________________________________________________________________ activation_89 (Activation) (None, None, None, 3 0 batch_normalization_89[0][0] __________________________________________________________________________________________________ activation_92 (Activation) (None, None, None, 3 0 batch_normalization_92[0][0] __________________________________________________________________________________________________ activation_93 (Activation) (None, None, None, 3 0 batch_normalization_93[0][0] __________________________________________________________________________________________________ batch_normalization_94 (BatchNo (None, None, None, 1 576 conv2d_94[0][0] __________________________________________________________________________________________________ activation_86 (Activation) (None, None, None, 3 0 batch_normalization_86[0][0] __________________________________________________________________________________________________ mixed9_1 (Concatenate) (None, None, None, 7 0 activation_88[0][0] activation_89[0][0] __________________________________________________________________________________________________ concatenate_2 (Concatenate) (None, None, None, 7 0 activation_92[0][0] activation_93[0][0] 
__________________________________________________________________________________________________ activation_94 (Activation) (None, None, None, 1 0 batch_normalization_94[0][0] __________________________________________________________________________________________________ mixed10 (Concatenate) (None, None, None, 2 0 activation_86[0][0] mixed9_1[0][0] concatenate_2[0][0] activation_94[0][0] __________________________________________________________________________________________________ global_average_pooling2d_1 (Glo (None, 2048) 0 mixed10[0][0] __________________________________________________________________________________________________ dense_1 (Dense) (None, 1024) 2098176 global_average_pooling2d_1[0][0] __________________________________________________________________________________________________ dense_2 (Dense) (None, 200) 205000 dense_1[0][0] ================================================================================================== Total params: 24,105,960 Trainable params: 2,303,176 Non-trainable params: 21,802,784 __________________________________________________________________________________________________

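If the summary of your transfer-learning model looks like this, with every InceptionV3 layer listed individually in the top-level model, then you have the problem.

Changing the code above to the following fixes it: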

from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Input
from keras import backend as K

# create the base pre-trained model
# key change: call base_model on a new Input tensor, so that it is nested
# inside the final model as a single layer
Inp = Input((224, 224, 3))
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
x = base_model(Inp)

# add a global spatial average pooling layer
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=Inp, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)

Run model.summary() again and look at the model structure:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         (None, 224, 224, 3)       0
_________________________________________________________________
inception_v3 (Model)         (None, 5, 5, 2048)        21802784
_________________________________________________________________
global_average_pooling2d_2 ( (None, 2048)              0
_________________________________________________________________
dense_3 (Dense)              (None, 1024)              2098176
_________________________________________________________________
dense_4 (Dense)              (None, 200)               205000
=================================================================
Total params: 24,105,960
Trainable params: 2,303,176
Non-trainable params: 21,802,784
_________________________________________________________________
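As a quick sanity check, the parameter counts in this summary can be reproduced by hand. The sketch below is my own addition: `dense_params` is a hypothetical helper, not a Keras function, and every number is copied from the summary above. It only confirms that the two new Dense layers account for all trainable parameters, while the whole nested inception_v3 model stays frozen.

```python
# Sanity-check the parameter counts reported by model.summary().
# All numbers are copied from the summary above; nothing here needs Keras.

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer (hypothetical helper)."""
    return n_in * n_out + n_out

base_params = 21802784                 # inception_v3 (Model) row, all frozen
dense_1 = dense_params(2048, 1024)     # pooled 2048 features -> 1024 units
dense_2 = dense_params(1024, 200)      # 1024 units -> 200 classes

trainable = dense_1 + dense_2
total = base_params + trainable

print(dense_1, dense_2)   # 2098176 205000
print(trainable)          # 2303176
print(total)              # 24105960
```

If the "Trainable params" figure in your own summary is much larger than the sum of your new layers, some of the base model's layers are probably not frozen.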

And here are the correct training results:

Epoch 1/20
1500/1500 [==============================] - 27s 18ms/step - loss: 2.4664 - binary_accuracy: 0.9125 - f1_keras: 0.0521 - val_loss: 1.4697 - val_binary_accuracy: 0.9456 - val_f1_keras: 0.0619
Epoch 2/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.2806 - binary_accuracy: 0.9467 - f1_keras: 0.0795 - val_loss: 1.2819 - val_binary_accuracy: 0.9466 - val_f1_keras: 0.0839
Epoch 3/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.0431 - binary_accuracy: 0.9526 - f1_keras: 0.1203 - val_loss: 1.3012 - val_binary_accuracy: 0.9468 - val_f1_keras: 0.0908
Epoch 4/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.9168 - binary_accuracy: 0.9555 - f1_keras: 0.1493 - val_loss: 1.3257 - val_binary_accuracy: 0.9445 - val_f1_keras: 0.0922
Epoch 5/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.8281 - binary_accuracy: 0.9577 - f1_keras: 0.1959 - val_loss: 1.3123 - val_binary_accuracy: 0.9468 - val_f1_keras: 0.0969

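As you can see, the validation accuracy is back to normal, and val_loss now settles around 1.3 instead of climbing every epoch. Attentive readers will notice that the validation F1 score still lags behind the training F1. That is because I trained on only 1500 samples to test the model, so some overfitting is to be expected.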

If you want to unfreeze the last N layers of base_model, first run the code below to see how many layers there are and what each one is:

for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

Then unfreeze the last N layers as needed:

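# N is the number of trailing base_model layers to unfreeze (N >= 1).
# Index into base_model.layers: in the fixed model above, model.layers only
# holds the input, the nested inception_v3 model, and the new top layers.
for layer in base_model.layers[:-N]:
    layer.trainable = False
for layer in base_model.layers[-N:]:
    layer.trainable = True
# compile again after changing the trainable flags
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')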
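The freeze/unfreeze split can also be wrapped in a small function. This is only a sketch of my own: `unfreeze_last_n` is a hypothetical helper, not part of Keras, and `fake_base` is a stand-in object so that the snippet runs without Keras or the ImageNet weights. It splits by index rather than with `[:-N]`, because with N = 0 the slice `layers[:-0]` is empty and `layers[-0:]` is the whole list, which would unfreeze everything.

```python
from types import SimpleNamespace

def unfreeze_last_n(base_model, n):
    """Freeze every layer of base_model except the last n (hypothetical helper)."""
    layers = base_model.layers
    # Split by index so that n = 0 freezes everything instead of nothing.
    split = max(len(layers) - n, 0)
    for layer in layers[:split]:
        layer.trainable = False
    for layer in layers[split:]:
        layer.trainable = True
    # Return the names of the layers that are now trainable.
    return [layer.name for layer in layers[split:]]

# Stand-in for a Keras model so the sketch runs without Keras installed:
# anything with a .layers list whose items have .name and .trainable works.
fake_base = SimpleNamespace(layers=[
    SimpleNamespace(name=name, trainable=False)
    for name in ["conv2d_93", "batch_normalization_93", "activation_93", "mixed10"]
])

print(unfreeze_last_n(fake_base, 2))                     # ['activation_93', 'mixed10']
print([layer.trainable for layer in fake_base.layers])   # [False, False, True, True]
print(unfreeze_last_n(fake_base, 0))                     # []
```

With a real model you would pass base_model itself and then call model.compile(...) again, since the model should be compiled after the trainable flags are set.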

If this solved your problem, leave a like before you go!

Reference: https://github.com/keras-team/keras/pull/9965#discussion_r187806860


